ECGpred: Correlation and Prediction of Gene Expression from Nucleotide Sequence
Authors
Abstract
Developing systems that predict gene expression from the large volume of available microarray data is an important problem. In the present study, a support vector machine (SVM) based method has been developed to predict the expression of genes from their nucleotide sequences. In this method, an SVM was trained on microarray data for a set of genes, and the trained SVM was used to predict the expression of other genes of the same organism under the same condition. The SVM models were developed using the nucleotide, dinucleotide, and trinucleotide composition of genes and achieved correlation coefficients (r) of 0.25, 0.70, and 0.82, respectively, between predicted and experimentally determined gene expression. Besides overall trinucleotide composition, we also tried codon composition in each forward reading frame, and achieved correlations of r = 0.86, 0.83, and 0.73 between predicted and actual expression using trinucleotide composition from the first, second, and third frames, respectively. The method was developed on 4807 genes of Saccharomyces cerevisiae obtained from Holstege et al. (1998) and evaluated using 5-fold cross-validation. A web server, ECGpred, has been developed to allow users to understand the relationship between expression and various components of genes, such as coding/non-coding regions and transcription factors (http://www.imtech.res.in/raghava/ecgpred/).
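The composition features described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `kmer_composition` and `pearson_r` are hypothetical, and the abstract does not specify whether windows overlap or how ambiguous bases are handled. The sketch assumes that overall composition uses overlapping windows, that frame-specific codon composition uses non-overlapping triplets starting at offset 0, 1, or 2, and that windows containing non-ACGT characters are skipped.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGT"


def kmer_composition(seq, k, frame=None):
    """Return the fractional frequency of every possible k-mer in seq.

    The result has 4**k entries, ordered lexicographically over ACGT.
    frame=None   : overlapping windows, one per sequence position (assumption).
    frame=0, 1, 2: non-overlapping windows of step k starting at that offset,
                   e.g. k=3 gives codon composition in one reading frame.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    if frame is None:
        starts = range(0, len(seq) - k + 1)
    else:
        starts = range(frame, len(seq) - k + 1, k)
    total = 0
    for i in starts:
        word = seq[i:i + k]
        if word in counts:  # skip windows with ambiguous bases such as N
            counts[word] += 1
            total += 1
    # Guard against empty or fully ambiguous sequences.
    return [counts[w] / total if total else 0.0 for w in kmers]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

Under these assumptions, `kmer_composition(seq, 1)`, `kmer_composition(seq, 2)`, and `kmer_composition(seq, 3)` yield 4-, 16-, and 64-dimensional feature vectors, and `kmer_composition(seq, 3, frame=0)` gives the codon composition of the first forward frame. Such vectors could then be passed to any SVM regression implementation, with `pearson_r` comparing predicted and measured expression on held-out folds.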
Similar resources
Cloning and Characterization of cbhII Gene from Trichoderma parceramosum and Its Expression in Pichia pastoris
The genomic and cDNA clones encoding cellobiohydrolase II (CBHII) have been isolated and sequenced from a native Iranian isolate of Trichoderma parceramosum, a high cellulolytic enzymes producer isolate. This represents the first report of cbhII gene from this organism. Comparison of genomic and cDNA sequences indicates this gene contains three short introns and also an open reading frame codin...
Expression of Prunus Necrotic Ringspot Virus Coat Protein in E. coli
Background and Aims: Serological assay is considered one of the best choices for conducting large numbers of infection tests. Recombinant DNA technology has been used for expression of virus coat protein (CP) gene in prokaryotic bacterial cells such as Escherichia coli and the recombinant CP (rCP) is used as immunogen in antibody production. Heterologous CP protein expression and purification...
Prediction of Blasting Cost in Limestone Mines Using Gene Expression Programming Model and Artificial Neural Networks
The use of blasting cost (BC) prediction to achieve optimal fragmentation is necessary in order to control the adverse consequences of blasting such as fly rock, ground vibration, and air blast in open-pit mines. In this research work, BC is predicted through collecting 146 blasting data from six limestone mines in Iran using the artificial neural networks (ANNs), gene expression programming (G...
The vlhA gene sequencing of Iranian Mycoplasma synoviae isolates
Mycoplasma synoviae expressed variable lipoprotein haemagglutinin (VlhA) is believed to play a major role in pathogenesis of the disease by mediating adherence and immune evasion. The aim of this study was sequencing Iranian M. synoviae isolates for the detection of nucleotide variation in the M. synoviae vlhA gene. Using oligonucleotide primers complementary to the single-copy conserved 5´ end...
Phylogenetic analysis and genetic variation of Tomato yellow leaf curl virus based on the V1 gene in Iraq
Tomato yellow leaf curl virus (TYLCV) is a major pathogen in tropical and subtropical areas. During 2014-2015, a total of 393 tomato samples showing Tomato yellow leaf curl disease (TYLCD) symptoms were collected from six different provinces of Iraq. In serological assays, 55 out of 393 samples (14%) reacted positively with TYLCV-specific antibodies. The presence of TYLCV was verified in 21 (...